Session 1: Lexicons, Corpora, and Evaluation
Abstract
Our technologies for collecting, storing, and disseminating vast amounts of information have gotten ahead of our technologies for collating and analyzing it, and that situation has posed a serious challenge for human language technology. As a consequence, natural language processing has been moving rapidly towards large-scale systems addressed to real tasks. Demos that won't scale up are no longer interesting.
Similar resources
XRCE Participation in CLEF 2002
In this paper, we describe the methods we used for the Cross-Lingual Evaluation Forum CLEF 2002, and more specifically for the GIRT Task. The methods are based on (1) the extraction of two bilingual lexicons, one from parallel corpora and the other one from comparable corpora, (2) the optimal combination of these bilingual lexicons in Cross-Language Information Retrieval and (3) the combination...
An Evaluation of the Concept Retrieval Annotation for Spanish-English CLEFER Parallel Corpora
This paper presents a study about the use of the concept retrieval annotation method for parallel corpora. The concept retrieval annotation method (CRA) consists of considering concepts as documents and text chunks as queries [1]. Concepts with higher similarity to text chunks are considered for generating the final semantic annotation. CRA makes use of an existing knowledge resource (KR) from ...
Efficient Data Selection for Bilingual Terminology Extraction from Comparable Corpora
Comparable corpora are the main alternative to the use of parallel corpora to extract bilingual lexicons. Although it is easier to build comparable corpora, specialized comparable corpora are often of modest size in comparison with corpora issued from the general domain. Consequently, the observations of word co-occurrences which are the basis of context-based methods are unreliable. We propose...
Automatically Generated Noun Lexicons for Event Extraction
In this paper, we propose a method for creating automatically weighted lexicons of event names. Almost all names of events are ambiguous in context (i.e., they can be interpreted in an eventive or noneventive reading). Therefore, weights representing the relative eventiveness of a noun can help for disambiguating event detection in texts. We applied our method on both French and English corpora...
Inducing Lexicons of Formality from Corpora
The spectrum of formality, in particular lexical formality, has been relatively unexplored compared to related work in sentiment lexicon induction (Turney and Littman, 2003). In this paper, we test in some detail several corpus-based methods for deriving real-valued formality lexicons, and evaluate our lexicons using relative formality judgments between word pairs. The results of our evaluation...
Publication date: 1994